214. Creating an Index Template with the Elastic API

Preface

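I was helping a colleague test different deletion times for different logs.
Clicking through the UI every time is tedious and easy to forget,
so I set it up through the API instead.
I don't know yet whether this can be hooked in directly at deployment time; I'll look into that later.
This time I'll just create a simple index template and lifecycle policy.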

Main text

First, make sure your Filebeat is actually shipping data into Elastic; see post 57, filebeat 補充說明 (Filebeat supplementary notes).

Next, choose your index template depending on whether you are writing to regular indices or to a data stream.
All of the commands below are run in Kibana Dev Tools.

The last path segment of each API call is the name of the resource being created.

indices

Create the index lifecycle policy via the API

PUT _ilm/policy/worker
{
  "policy": {
    "_meta": {
      "description": "used for worker log",
      "author": "Ezio",
      "project": {
        "name": "srs",
        "department": "lerouge"
      }
    },
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "set_priority": {
            "priority": 100
          },
          "rollover": {
            "max_age": "3d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "3d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}

Create the index template

PUT _index_template/worker
{
  "index_patterns" : ["worker-*"],
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "worker"
        }
      }
    }
  },
  "composed_of": ["ecs@mappings", "logs@mappings","logs@settings"],
  "priority" : 200,
  "version": 1,
  "_meta": {
    "description": "test by Ezio",
    "latest_modify_date": "2024-05-28"
  }
}
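
One thing to watch with plain indices: the rollover action in the policy needs a write alias to roll over, otherwise ILM is likely to stop with an error at the rollover step. A minimal sketch, assuming a hypothetical alias named worker (not part of the original setup) that Filebeat would write to: add index.lifecycle.rollover_alias to the template settings, then bootstrap the first index with that alias marked as the write index.

# In the template's lifecycle settings, next to "name" (the alias name is an assumption):
#   "rollover_alias": "worker"

# Bootstrap the first index; the numeric suffix lets rollover create worker-000002 next
PUT worker-000001
{
  "aliases": {
    "worker": {
      "is_write_index": true
    }
  }
}

The alias deliberately does not match worker-*, since an alias cannot share a name with an index.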

Check the index lifecycle policy and the index template

GET _ilm/policy/worker  #  Index Lifecycle Policies
GET _index_template/worker # index template
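
To confirm that the template and policy actually reach real indices, two more requests can help. This is only a sketch; worker-000001 is a hypothetical index name that just needs to match the worker-* pattern.

POST _index_template/_simulate_index/worker-000001  # settings and mappings a new index with this name would get
GET worker-*/_ilm/explain  # current phase, action and step of each matching index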


data stream

Create the index template
The difference from the template above is the addition of data_stream and template.lifecycle.data_retention.

PUT _index_template/videowork
{
  "index_patterns" : ["videoworker-*"],
  "template": {
    "lifecycle": {
      "data_retention": "3d"
    }
  },
  "data_stream": { },
  "composed_of": ["ecs@mappings", "logs@mappings","logs@settings"],
  "priority" : 200,
  "version": 1,
  "_meta": {
    "description": "test by Ezio",
    "latest_modify_date": "2024-05-28"
  }
}
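
With the template in place, a data stream is created the first time a document is indexed into a matching name, or it can be created explicitly. A minimal sketch, assuming a hypothetical stream name videoworker-test:

PUT _data_stream/videoworker-test
GET _data_stream/videoworker-test/_lifecycle  # should show the 3d data_retention if the lifecycle was applied

If one of the composed component templates also sets an ILM policy, it is worth checking which of the two ends up managing the backing indices.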

ref. Create a data stream with a lifecycle

indices vs data stream

The comparison below was generated by GPT-4o.

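Feature	Indices	Data Streams
Management	Manual, or via ILM	Rollover and backing indices managed automatically
Data storage	Suits data of any structure	Designed for time series data
Mapping configuration	Configured per index	Applied automatically through the index template
Rollover policy	Configured manually or via ILM	Automatic rollover
Query performance	Standard index behavior	Optimized for time series queries
Index creation	`PUT /index_name`	Created automatically from the index template
Suitable data	Any data, including non-time-series data	Mainly time series data such as logs and metrics
Learning curve	Requires learning and managing index details	Simpler, especially for large volumes of time series data
Lifecycle management	Requires configuring ILM	Managed automatically
Storage efficiency	Depends on manual tuning	Designed for time series data, so storage-efficient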

Pros and cons

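Item	Indices	Data Streams
Pros	- More flexible; works for any kind of data	- Rollover and indices managed automatically
	- Fully customizable mappings and settings	- Simpler configuration and management
	- Supports complex queries and aggregations	- Optimized querying and storage for time series data
		- Well suited to large volumes of logs and metrics
Cons	- Indices and rollover must be managed manually	- Mainly aimed at time series data
	- Configuration and management are relatively complex	- Not suited to non-time-series data
	- ILM must be configured and maintained	- Mappings and settings are relatively inflexible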

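Recommended use cases

Use case	Indices	Data Streams
Log management	Usable, but rollover and management must be configured manually	Best choice; managed automatically, simpler operations
Application monitoring	Usable, but needs custom indices and mappings	Best choice; designed for time series data
Periodic reports	Suits any kind of report	Suits time series reports
Historical data queries	Best choice; queries can be tuned to specific needs	Suits historical queries over time ranges
Complex data analysis	Best choice; supports flexible mappings and queries	Suits fast queries and aggregations over time series data
Non-time-series data storage	Best choice; suits all types of data	Not recommended; only suits time series data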

Conclusion

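Judging from the API alone, going straight to a data stream is quicker and less hassle.
On top of that, I'm currently running on ECK,
and I never set up disks for a warm or cold phase, so a simple retention period is all I need.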

ref. ES 的超前佈署 - Index Template
